Abstract

In  this  paper  we  examine  the  efficacy  of  using  the  closest  distance  to  center  algorithm  in  conjunction  with  ellipsoidal 
multivariate  trimming  (MVT)  to  find  outliers  in  a  hyperspectral  image.  MVT  is  applied  here  as  a  global  anomaly  detector 
on  images  that  are  pre-processed  into  clusters  using  a  technique  called  X-means.  Under  the  assumption  that  there  are 
no more than 5% outliers in any given cluster set, we develop a method, based upon principal component analysis pre-processing,
to create a flexible threshold for determining the percentage of data to retain with MVT. Using a retention
percentage  that  more  adequately  reflects  the  actual  number  of  outlier-free  observations  allows  one  to  form  estimates 
of  the  mean  and  covariance  matrix  that  more  effectively  decrease  the  effects  of  swamping  and  masking  as  compared  to 
using  a  set  percentile  for  retention.  These  ideas  are  tested  against  real  and  synthetically  generated  hyperspectral  imagery. 
1. Background

This paper deals with the military application of hyperspectral
imagery (HSI). A succinct, well-written summary
of the military utility of HSI is found at GlobalSecurity.org.1
An excerpt follows:

Hyperspectral  imaging  technology  uses  hundreds  of  very 
narrow  wavelength  bands  to  ‘see’  reflected  energy  from 
objects  on  the  ground.  This  energy  appears  in  the  form  of 
‘spectral  fingerprints’  across  the  light  spectrum  and  enables 
collection  of  much  more  detailed  data  and  produce  a  much 
higher  spectral  resolution  of  a  scene  than  possible  using  other 
remote  sensing  technologies. 

Once  these  fingerprints  are  detected,  special  algorithms — 
repetitive,  problem-solving  mathematical  calculations — then 
assess  them  to  differentiate  various  natural  and  manmade 
substances  from  one  another.  ‘Signature’  libraries  may  also  be 
used  to  identify  specific  materials — e.g.,  rooftops,  parking 
lots,  grass,  or  mud — by  comparing  a  library’s  pre-existing 
reference  catalogs  with  freshly  taken  hyperspectral  images  of 
the  battlefield  from  space. 

Image  processing  equipment  then  portrays  the  various  types  of 
terrain and objects upon it in different colors forming a ‘color
cube,’ each based on the wavelength of the reflected energy
captured  by  the  image.  These  colors  are  subsequently 
‘translated’  into  maps  that  correspond  to  certain  types  of 
material  or  objects  to  detect  or  identify  military  targets  such 
as  a  tank  or  a  mobile  missile  launcher.  Algorithms  can  also 
categorize  types  of  terrain  and  vegetation  (useful,  for  example, 
in  counter-narcotic  operations),  detecting  features  such  as 
disturbed  soil,  stressed  vegetation,  and  whether  the  ground 
will  support  the  movement  of  military  vehicles. 

Once  this  technology  is  mature,  theater  commanders  can  use 
mobile  ground  stations  to  process  in  real-time  information 
transmitted  by  the  satellite,  critical  to  theater  commanders  for 
them  to  keep  pace  with  rapidly  changing  conditions. 
More detailed discussions are found in Ardouin et al.2
and Briottet et al.3 The background material given above
reflects a general application area known as signature
matching.4 The application of such algorithms is complicated
by the need to convert the collected sensor data into
the  spectra  of  the  material  of  interest.  The  sensor  collects 
what  is  known  as  spectral  radiance.  Radiance  is  modulated 
by  atmospheric  effects,  such  as  the  absorption  of  the  energy 
of  certain  spectral  bands  and  the  superposition  of  solar 
energy  scattered  by  the  atmosphere  on  to  the  light  reflected 
by an object. The spectra of the material of interest are measured
in terms of what is called emissivity or reflectance. It
can  be  thought  of  as  the  spectral  signature  of  the  material  as 
collected  under  laboratory  conditions  where  the  effects  of 
the  atmosphere  have  been  factored  out.  The  application 
of  signature-matching  algorithms  requires  the  conversion 
of  the  radiance  into  reflectance.  This  process  is  known  as 
atmospheric  calibration.  Now,  if  the  complications  due  to 
weather are compounded with the interest in several different
objects, each of which may be associated with multiple
signatures,  there  may  be  a  desire  to  apply  what  are  known 
as  anomaly  detectors.  These  seek  to  find  observations  that 
are  different  from  typical  background  materials  without 
using  specific  target  signatures.4 

In  this  paper,  we  propose  a  new  anomaly  detector.  The 
method  can  be  described  as  a  global  version  of  the  localized 
RX algorithm.5,6 The new method incorporates robust estimates
of the filter’s parameters. Where the RX algorithm
involves  the  movement  of  a  window  through  the  pixels  of 
an  image  while  computing  localized  statistics,  the  proposed 
method  computes  its  scores  relative  to  a  robust  parameter 
set  computed  for  clusters  of  pixels  within  the  image. 
Related  work  can  be  found  in  Taitano  et  al.7  In  Section  2,  a 
little  more  background  is  given  for  readers  unfamiliar  with 
hyperspectral  imaging. 

2.  Introduction 

Digital  photographs  taken  from  aircraft  or  satellites  can  be 
used  for  a  wide  range  of  military  and  civilian  applications, 
such  as  locating  a  tank  in  a  field  or  establishing  the  presence 
of  a  certain  type  of  foliage.  Several  methods  exist  to  locate 
anomalies in an image; for instance, highly trained individuals
view the photograph with the human eye, or data from
the image is analyzed using either local or global anomaly
detectors. The first method can prove to be very difficult,
especially in a highly cluttered area. It can also be extremely
time consuming, because the area of interest has to be photographed
and sent to an imagery expert, who then manually
analyzes the image to determine if there are anomalies. The
second, and possibly more effective, method is to analyze
the data from the image using an image-processing algorithm
known as an anomaly detector.

A hyperspectral image is similar to a photograph taken
from an ordinary digital camera; however, a hyperspectral
image may contain data from more than 250 wavelength
bands from the electromagnetic (EM) spectrum, which
includes some non-visible bands, whereas a standard digital
camera collects data from only three bands in the visible
spectrum, which the human eye sees as red, green, and blue.
These images are made by specialized cameras placed on,
say, an aircraft within the Earth’s atmosphere or on a satellite
in space. The image is divided up into pixels and the
magnitude of the signal for each band is recorded for each
pixel. The number of pixels in an image depends on the
resolution of the camera. An image that captures fewer than
20 bands of the spectrum is referred to as a multispectral
image, and an image with 20 or more bands is called a
hyperspectral image. All of this data is then stored in a
three-dimensional hypermatrix* referred to as a data cube,
with the first two dimensions of the hypermatrix, x and y,
being the location of the pixel in the image, and the third
dimension, z, being the magnitude at each of the recorded
EM bands. The image can be thought of as a series of vectors,
one for each pixel location, each containing the wavelength
magnitudes for each of the bands.

Consequently, HSI data can be analyzed using standard
multivariate statistical techniques, and anomalies may be
found by locating outliers within the data. Certain techniques
specific to locating anomalies in an image, such as
global anomaly detectors, work most efficiently when
applied to homogeneous datasets. Therefore, if data are
being analyzed for the presence of anomalies in an image
containing more than one main feature, such as a field with
a road running through it, cluster analysis must be accomplished
prior to using a global anomaly detector, or the
detector may determine the road is the anomaly in the
image, and true anomalies may be overlooked. When performed
properly, cluster analysis splits the data into the
requested number of subsets, known as clusters, allowing
global anomaly detectors to analyze each cluster individually
to produce the best results. In this paper, we
propose the use of the closest distance to center (CDC)
algorithm9 in conjunction with the ellipsoidal multivariate
trimming (MVT) algorithm10 as a method for finding
anomalies in hyperspectral images. We call this new method
‘screened MVT’. The purpose of this paper is to demonstrate
that some standard tools used in process control can
be readily adapted to a new problem area.

The  paper  is  organized  as  follows.  Firstly,  we  present  a 
brief  overview  of  the  area  of  HSI.  Next,  we  discuss  the  need 
for  dimensionality  reduction  and  begin  the  development  of 
a CDC/MVT anomaly detector, screened MVT. The proposed
method is tested on a set of hyperspectral images,
and  the  paper  is  closed  with  a  summary. 

3.  Hyperspectral  imagery 

To gain a basic understanding of HSI, we can begin with a
discussion of the common digital camera that has become
ubiquitous in modern society. Conceptually, when we use a
digital camera to take a color photograph, the camera
divides the imaged scene into a two-dimensional grid of
pixels. For each pixel, three pieces of information are collected.
These are, respectively, the amount of energy emanating
from the pixel in the red, green, and blue portions of
the EM spectrum. This information is stored in three separate
two-dimensional arrays. For any given pixel, combining
its respective red, green, and blue information produces
the true color of the pixel. Of course, viewing the array of
colored pixels on a computer screen or in its printed form
reveals the scene originally photographed.

Figure 1. Example of the bands of the electromagnetic spectrum used in hyperspectral imagery.11

If we image a scene for the purpose of identifying different
objects that it may contain, a simple color image produced
by a digital camera may suffice; however, a true-color
image has its limitations. For example, vegetation and camouflage
nets may both appear green, making it very difficult
for the human eye - or worse, for the computer - to discriminate
one from the other. As seen in Figure 1, it is
important to note that the visible spectrum of light is only a
small fraction of the total EM spectrum that may be detected.

To address this limitation of true-color imagery, hyperspectral
sensors collect information beyond the visible
region of the EM spectrum. Just as a digital camera produces
three images for wavelength bands corresponding to
red,  green,  and  blue  light,  a  hyperspectral  sensor  produces 
images  for  many  different  contiguous  wavelength  bands, 
typically  spanning  the  visible  to  near-infrared  regions  of  the 
EM  spectrum.  The  number  of  image  bands  collected  by  a 
sensor  can  range  from  20  to  over  500. 


Consider the M × N pixelated scene of Figure 2. The
hyperspectral sensor can be thought of as producing P different
images, one for each band it collects. This collection
of pixel-by-band information is often called an ‘image
cube’. For m = 1, ..., M and n = 1, ..., N, the pixel in row m,
column n of band 1 refers to the same spatial location of the
scene as the pixels in row m, column n of every other band
in the image cube. The sensor reading for a pixel in row m,
column n, and band k = 1, ..., P, can be referred to by the
variable $x_{mnk}$. For a given pixel address (m, n), we can form
the vector

$$\mathbf{x}_{mn} = [x_{mn1}, x_{mn2}, \ldots, x_{mnP}]^{T} \tag{1}$$


This  vector  is  often  referred  to  as  a  pixel  vector.  If  we  take 
the  transpose  of  all  the  pixel  vectors  in  the  image  and  place 
them in an (M × N) × P array, we form the data matrix, X,
which  is  commonly  used  in  multivariate  statistical  analysis. 
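
The construction of the data matrix can be illustrated with a minimal NumPy sketch. It assumes the image cube is held as an array of shape (M, N, P) with the band index last; the function name `cube_to_data_matrix` and the toy values are hypothetical and serve only as an illustration.

```python
import numpy as np


def cube_to_data_matrix(cube):
    """Flatten an (M, N, P) image cube into an (M*N, P) data matrix.

    Row m*N + n of the result is the transposed pixel vector of pixel (m, n),
    using zero-based indices.
    """
    cube = np.asarray(cube)
    if cube.ndim != 3:
        raise ValueError("expected a 3-D cube of shape (M, N, P)")
    M, N, P = cube.shape
    return cube.reshape(M * N, P)


# Toy cube: 2 x 3 pixels and 4 bands (arbitrary placeholder values).
cube = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
X = cube_to_data_matrix(cube)
```

Each row of `X` is one observation and each column one band, which is the layout assumed by standard multivariate routines.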

Using  X,  we  are  free  to  analyze  the  image  data  using 
multivariate  analysis  methods  such  as  principal  component 
analysis (PCA), cluster analysis, maximum likelihood classification,
discriminant analysis, and others.

In this paper, all real test images are taken from
the COMPact Airborne Spectral Sensor (COMPASS)
and Hyperspectral Digital Imagery Collection Equipment
(HYDICE) sensors. The COMPASS sensor is able to receive
data on an area at 255 different wavelengths of light across
the EM spectrum. Synthetic images were also employed in
this research. They are described in a subsequent section.

Figure 2. The basic hyperspectral imaging process and data representation.

4. Decreasing the dimensionality of the dataset

Since HSI is characterized by large volumes of data (over
28,800 pixels taken at over 200 wavelengths in the smallest
example used in this paper), it is of practical necessity to
decrease the dimensionality of the dataset. This is accomplished
by PCA.11,12 Other compression methods, such as
the use of wavelets, have been proposed.13,14 Here, PCA is
employed since it is relatively simple to apply and, arguably,
it is the standard compression method used in practice.
It is well known that PCA can decrease the dimension
of the data significantly while still maintaining the ability to
explain variability in the dataset.15 PC scores are found by
projecting the data onto eigenvectors of their correlation/
covariance matrix. For the purposes of this paper, we determined
the number of principal components to retain by using
Kaiser’s criterion15 on all images so that each set that was
originally of dimension (M × N) × P was decreased to
dimension (M × N) × r, where r ≪ P. The use of PCA for
finding outliers in multivariate data is surveyed by
Gnanadesikan and Kettenring16 and Rao.17
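
A minimal sketch of this reduction step follows. It assumes PCA on the correlation matrix and reads Kaiser's criterion as "retain components whose eigenvalue exceeds 1"; the paper does not spell out either choice, so both are assumptions, and the function name `pca_kaiser` and the toy data are hypothetical.

```python
import numpy as np


def pca_kaiser(X):
    """Return PC scores, sorted eigenvalues, and the number of retained PCs.

    Assumption: PCA is done on the correlation matrix and a component is
    retained when its eigenvalue is greater than 1 (Kaiser's criterion).
    """
    X = np.asarray(X, dtype=float)
    # Standardize each band so that the covariance of Z is the correlation of X.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)      # ascending order
    order = np.argsort(eigvals)[::-1]         # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    r = max(1, int(np.sum(eigvals > 1.0)))    # keep at least one component
    scores = Z @ eigvecs[:, :r]               # project onto leading eigenvectors
    return scores, eigvals, r


# Toy data: 6 "bands" driven by 2 latent factors plus small noise.
rng = np.random.default_rng(0)
factors = rng.normal(size=(500, 2))
X = np.hstack([
    factors[:, [0]] + 0.1 * rng.normal(size=(500, 3)),
    factors[:, [1]] + 0.1 * rng.normal(size=(500, 3)),
])
scores, eigvals, r = pca_kaiser(X)
```

In this toy case the six correlated bands collapse to two components, mirroring the reduction from P bands to r scores described above.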

As alluded to earlier, rather than attempting to find
anomalies across entire images, the images were first clustered
into homogeneous spectral groups using an algorithm
called X-means.18 X-means is a clustering technique that
uses an iterative scheme to find the proper number of clusters
and, in turn, performs the cluster binning based upon
Bayesian information criterion (BIC) scores.19

5.  Outlier  detection  overview 

Certain outlier detection methods, such as MVT, are known
to be unreliable due to their use of the Mahalanobis distance
in determining an initial mean vector and covariance matrix
estimate.9,10 The CDC algorithm has been employed to alleviate
this problem by determining a more robust initial starting
point for mean vector and covariance estimation;9 the
starting point is more compatible with the set of good data
(without the outliers present), and the object is to use
these estimates to begin MVT. Applying MVT with the CDC
algorithm as an initial starting point should perform significantly
better than simply using MVT when there are multiple
outliers in the data.9 CDC/MVT seeks to trim out the bad
data points to obtain more robust estimates of the covariance
matrix and the mean vector. Such a procedure provides a
more accurate Mahalanobis distance measurement that can
be used to advantage to spot outlying observations.

6.  The  CDC/MVT  algorithm 

Herein the data have been pre-processed using PCA and a
transformed dataset of significantly lower dimensionality is
generated. The initial starting point for MVT is found by
performing the CDC algorithm on the transformed dataset
to find ‘good’ estimates for the mean and the covariance.
These estimates are rendered by finding the n/2 observations
that are closest to the median vector using either
Euclidean distance (2-norm) or the largest component
absolute value difference from the centroid (max norm) and
determining the mean and covariance from this subset.9
Once these estimates are found we then begin MVT.

MVT is an iterative process that is based upon a percentile
(50% for Chiang’s algorithm9) of the smallest observations
of Mahalanobis distance within a sample. These observations
are used to determine a new mean vector and covariance
matrix. This process is repeated using the most-recent
mean vector and covariance matrix until the mean vector
and covariance matrix have stabilized. Once the iterations
are complete, the resulting Mahalanobis distance for each
data point is then used for outlier determination.9 The literature
varies slightly in one regard during this process. While
Chiang et al.9 state that the mean vector and covariance
matrix must stabilize, Devlin et al.20 recommend using the
stabilization of only the correlation matrix as the stopping
criterion. In addition, Devlin et al.20 recommend using a
difference of $10^{-3}$ as the stabilization criterion within the
correlation matrix or a maximum of 25 total iterations. Now,
the 50th percentile for retention used in MVT is due to the
low breakdown point of 50% outliers for the algorithm.
Devlin et al.20 propose a different value for retention in
MVT in which the percentile is equal to $100 \times (1 - 1/(p+1))\%$,
where p is equal to the dimensionality of the dataset.
In this paper, we also propose using a higher percentile
for retention in MVT based upon the assumption that there
are significantly fewer outliers than background pixels in
any given hyperspectral image cluster. We call this procedure
screened MVT. The percentage to retain is based
upon first taking the Mahalanobis distances found after the
CDC algorithm and computing a conservative percentile
from a fitted gamma distribution (maximum likelihood
parameter estimates)21 with $\alpha_{\text{retain}} = 10^{-1}$. Next, the
percentage to retain in MVT is set to the share of observations
whose distances do not exceed this percentile.
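
To make the fixed retention rules concrete, the short sketch below evaluates the dimension-based rule $100 \times (1 - 1/(p+1))\%$ for a few values of p. The function name and the chosen dimensions are illustrative assumptions, not values from the experiments.

```python
def devlin_retention_percent(p):
    """Retention percentile 100 * (1 - 1/(p + 1)) for data of dimension p."""
    if p < 1:
        raise ValueError("p must be a positive dimension")
    return 100.0 * (1.0 - 1.0 / (p + 1))


# Dimension-based retention for a few hypothetical numbers of retained PCs.
table = {p: devlin_retention_percent(p) for p in (1, 4, 9, 19)}
```

The rule depends only on the dimension p and not on the data in a given cluster, whereas the screened approach derives the retention percentage from the observed Mahalanobis distances.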

The  CDC/MVT  algorithm  is  very  similar  to  the  blocked 
adaptive  computationally  efficient  outlier  nominators 
(BACON)  algorithm  described  by  Billor  et  al.22  and  adapted 
for  hyperspectral  image  processing  by  Smetek  and  Bauer23  in 
that its overall goal is to trim the dataset so that the true covariance
structure of the data can be determined. Observations
that  are  outliers  will  subsequently  have  much  larger  distance 
estimates  and  should  be  found  easily  in  outlier  determination 
after  iterative  estimates  have  stabilized.  The  CDC  algorithm 
and  our  ‘screened  MVT’  algorithm  are  detailed  below. 

6.1 The CDC algorithm

Input: An n × r matrix X of PC scores from hyperspectral
data.

Output: An initial estimate of the mean, $\mu_c$, and covariance,
$\Sigma_c$, of the cluster set based upon the closest
n/2 observations to the median.

Step 1: The median vector of the data is obtained.
Step 2: Determine the n/2 observations that are closest
to this median vector using either Euclidean distance
or the max norm distance, where n is equal to
the size of the dataset.

Step 3: From the n/2 observations find estimates for
the mean vector, $\mu_c$, and covariance matrix, $\Sigma_c$. The
mean vector and covariance matrix are then used as a
starting point for MVT.
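
The three CDC steps can be sketched in NumPy as follows. This is an illustrative reading of the steps above rather than the reference implementation of Chiang et al.; the function name `cdc_estimates`, the rounding of n/2 down to an integer, and the toy data are assumptions.

```python
import numpy as np


def cdc_estimates(X, norm="euclidean"):
    """Mean and covariance of the n/2 observations closest to the median."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    median = np.median(X, axis=0)                 # Step 1: median vector
    diff = X - median
    if norm == "euclidean":
        dist = np.linalg.norm(diff, axis=1)       # 2-norm distance
    elif norm == "max":
        dist = np.abs(diff).max(axis=1)           # largest component difference
    else:
        raise ValueError("norm must be 'euclidean' or 'max'")
    closest = np.argsort(dist)[: n // 2]          # Step 2: closest n/2 (rounded down)
    subset = X[closest]
    mu_c = subset.mean(axis=0)                    # Step 3: starting estimates
    sigma_c = np.cov(subset, rowvar=False)
    return mu_c, sigma_c, closest


# Toy cluster: 200 background points plus 10 gross outliers (hypothetical data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(200, 3)), 50.0 + rng.normal(size=(10, 3))])
mu_c, sigma_c, closest = cdc_estimates(X)
```

Because only the half of the data nearest the median is used, the gross outliers in this toy cluster do not enter the starting estimates.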

6.2  The  screened  MVT  algorithm 

Input: An n × r matrix X of PC scores from hyperspectral
data and initial estimates of $\mu$ and $\Sigma$ from CDC.
Output: Mahalanobis distance calculations for the
corresponding n data points in the cluster set.

Step 1: Determine the $\mu$ and $\Sigma$ for the dataset via CDC
(as shown above). These are $\mu_c$ and $\Sigma_c$, respectively.
Step 2: Compute Mahalanobis distances for each
observation, i = 1, ..., n, using $\mu_c$ and $\Sigma_c$:

$$d_i(\mu, \Sigma) = \sqrt{(x_i - \mu)^{T} \Sigma^{-1} (x_i - \mu)}$$

Step 3: Determine the percentage to retain in MVT by:

[A] fitting a Gamma distribution to the $d_i(\mu, \Sigma)$. Let

$$F(z) = \Pr\{d_i(\mu, \Sigma) \le z\} \quad \text{for all real } z$$

denote the Gamma c.d.f. that is fitted to the set
$\{d_i(\mu, \Sigma) : i = 1, \ldots, n\}$ of Mahalanobis distances;

[B] finding the quantile, $d^{\text{retain}}$, associated with a
conservative $\alpha_{\text{retain}} = 10^{-1}$, that is,

$$d^{\text{retain}} = F^{-1}(1 - \alpha_{\text{retain}});$$

[C] letting $m/n$, expressed as a percentage, be the amount to
retain in MVT, where m is the number of $d_i(\mu, \Sigma) \le d^{\text{retain}}$.

Step 4: Take the corresponding observations that fall
below the retention percentile, $d^{\text{retain}}$, as determined in
Step 3 for computation of new estimates for $\mu$ and $\Sigma$.
Step 5: Compare the new estimates for $\mu$ and $\Sigma$ to those
of the previous iteration. Return to Step 4 if the maximum
absolute difference between estimates is above
a user-defined threshold and MVT has iterated fewer
than 25 times. Else, proceed to Step 6.

Step 6: Declare observations as outliers by comparing
distances to an empirical distribution function of
testing data. The cutoff $\alpha_{\text{outlier}}$ is typically chosen to be
of the order $10^{-4}$ or, as will be seen in subsequent
results, $\alpha_{\text{outlier}}$ can be varied to produce operating
characteristic (OC) curves.

In  application,  these  algorithms  are  processed  sequentially 
and  for  each  cluster  set  within  an  image. 
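
Steps 2 to 5 of screened MVT can be sketched as below, under several stated assumptions: the Gamma fit uses SciPy's maximum likelihood fit with the location fixed at zero, each iteration retains the m observations with the smallest current Mahalanobis distances, and the stopping rule compares the mean vector and covariance matrix. The function names, the toy data, and the inline median-based start are hypothetical, and the sketch does not guard against a retained set too small for a non-singular covariance estimate.

```python
import numpy as np
from scipy import stats


def mahalanobis(X, mu, sigma):
    """Mahalanobis distance of every row of X from mu under covariance sigma."""
    diff = X - mu
    solved = np.linalg.solve(sigma, diff.T).T  # rows are sigma^{-1} (x_i - mu)
    return np.sqrt(np.einsum("ij,ij->i", diff, solved))


def screened_mvt(X, mu_c, sigma_c, alpha_retain=0.1, tol=1e-3, max_iter=25):
    """Return final Mahalanobis distances and the retained fraction m/n."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    d = mahalanobis(X, mu_c, sigma_c)                    # Step 2
    # Step 3[A]: Gamma fit by maximum likelihood (location fixed at 0: assumption).
    shape, loc, scale = stats.gamma.fit(d, floc=0)
    # Step 3[B]: conservative retention quantile.
    d_retain = stats.gamma.ppf(1.0 - alpha_retain, shape, loc=loc, scale=scale)
    m = int(np.sum(d <= d_retain))                       # Step 3[C]
    mu = np.asarray(mu_c, dtype=float)
    sigma = np.asarray(sigma_c, dtype=float)
    for _ in range(max_iter):
        keep = np.argsort(d)[:m]                         # Step 4: m smallest distances
        new_mu = X[keep].mean(axis=0)
        new_sigma = np.cov(X[keep], rowvar=False)
        # Step 5: largest absolute change in any element of either estimate.
        change = max(np.abs(new_mu - mu).max(), np.abs(new_sigma - sigma).max())
        mu, sigma = new_mu, new_sigma
        d = mahalanobis(X, mu, sigma)
        if change <= tol:
            break
    return d, m / n


# Toy cluster: 300 background pixels and 15 shifted "targets" (hypothetical data).
rng = np.random.default_rng(0)
background = rng.normal(size=(300, 3))
targets = 8.0 + 0.5 * rng.normal(size=(15, 3))
X = np.vstack([background, targets])

# CDC-style start: mean and covariance of the half closest to the median.
dist = np.linalg.norm(X - np.median(X, axis=0), axis=1)
half = X[np.argsort(dist)[: X.shape[0] // 2]]
d, retained = screened_mvt(X, half.mean(axis=0), np.cov(half, rowvar=False))
```

The returned distances `d` would then feed the Step 6 cutoff; the returned fraction plays the role of m/n from Step 3.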
7. Application to anomaly detection in hyperspectral imagery

In this research, we are looking for anomalies in a hyperspectral
image. The assumption is that these anomalies are
manmade and constitute spectral outliers in a statistical
sense. To this end, we are not concerned with natural anomalies
that may appear to be outliers when compared to their
surroundings/clusters.  As  seen  in  the  synthetic  hyperspectral 
image  (Figure  3),  there  are  many  observations  that  may  be 
considered  outliers  in  the  picture  (e.g.  trees  in  a  grass  field). 

To  alleviate  the  problem  of  having  many  natural  outliers 
dispersed  throughout  the  image,  the  images  are  first  clustered 
using  X-means.18  Once  the  image  has  been  clustered,  we  are 
in a position to look for outliers within relatively homogeneous
datasets. As seen in Figure 4, the outliers in the image
are  shown  in  green  (black)  with  a  yellow  (white)  border. 

The  task  is  to  identify  as  many  of  the  green  target  pixels 
as possible (true positives), while minimizing the identification
of non-target pixels as targets (false positives).

8.  Testing  results  and  analysis 

We  compared  the  results  for  21  different  hyperspectral 
images  using  three  forms  of  MVT  retention  (detailed 
below). The results from these three algorithms were compared
to output for the BACON algorithm22,23 for the same
images  to  determine  the  efficacy  of  each  MVT  algorithm 
against  a  robust  baseline  algorithm. 

The images used for testing comprised real images from the Air
Force’s Airborne Remote Sensing Program (ARES) and
synthetic images created using the Digital Imaging and
Remote Sensing Image Generation (DIRSIG) program.24
The ARES images were acquired through testing of the
HYDICE sensor during the Forest Radiance I and Desert
Radiance II data collection efforts. The images consist of
manmade objects such as vehicles, panels, camouflage
nets, and tables. For all real images, the locations of objects
of interest were determined during collection. The synthetic
images employed here were created at the same hypothetical
geographical location with differences in time of day,
sensor view angle, visibility, and target size. The reason for
using synthetic images in our testing is that currently there
are not a great number of ‘truthed’ hyperspectral images
available to the general research community. The DIRSIG
program is able to produce hyperspectral images that are
representative of real-world images, and affords the advantage
of allowing the user to specify the exact nature and
location of all the anomalies in the image. The synthetic
images used were all different variations of Figure 3.

OC  curves  were  found  by  processing  the  resulting 
Mahalanobis  distances  and  plotting  the  estimates  for  the 
true positive rate (at the pixel level) against the false positive
rate. An additional measure was taken as the area
underneath  the  OC  curve.  An  example  OC  curve  for  the  Air 
Force  image  is  given  in  Figure  5. 

In  Figure  5,  screened  MVT  has  the  largest  area  under  the 
OC  curve  (AUC).  The  AUCs  for  all  algorithms  are  given  in 
Table  1. 

The OC curves were obtained by varying $\alpha_{\text{outlier}}$. Each
OC curve was only considered up to a false positive rate of
0.05, since rates higher than 0.05 would render target detection
worthless based upon the preponderance of background
pixels in any given image.
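
A minimal sketch of how an OC curve and a truncated AUC might be computed from the final Mahalanobis distances is given below. Sweeping the threshold over the sorted distances stands in for varying $\alpha_{\text{outlier}}$; dividing the area by the false positive limit so that a perfect detector scores 1 is an assumption, since the normalization used for the reported AUC values is not stated. Function names and toy scores are hypothetical, and tied scores are not grouped.

```python
import numpy as np


def oc_curve(scores, is_target):
    """False and true positive rates as the detection threshold is lowered."""
    scores = np.asarray(scores, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    order = np.argsort(-scores, kind="stable")   # largest distance first
    hits = is_target[order]
    tp = np.cumsum(hits)                         # targets flagged so far
    fp = np.cumsum(~hits)                        # background pixels flagged so far
    tpr = np.concatenate(([0.0], tp / hits.sum()))
    fpr = np.concatenate(([0.0], fp / (~hits).sum()))
    return fpr, tpr


def partial_auc(fpr, tpr, max_fpr=0.05):
    """Trapezoidal area under the curve for fpr <= max_fpr, divided by max_fpr."""
    end_tpr = np.interp(max_fpr, fpr, tpr)       # linear interpolation at the limit
    keep = fpr <= max_fpr
    x = np.concatenate((fpr[keep], [max_fpr]))
    y = np.concatenate((tpr[keep], [end_tpr]))
    area = np.sum(np.diff(x) * (y[:-1] + y[1:]) / 2.0)
    return area / max_fpr


# Toy scores: 100 background pixels and 5 well-separated targets (hypothetical).
rng = np.random.default_rng(1)
scores = np.concatenate([rng.uniform(0, 1, 100), 10 + rng.uniform(0, 1, 5)])
labels = np.concatenate([np.zeros(100, dtype=bool), np.ones(5, dtype=bool)])
fpr, tpr = oc_curve(scores, labels)
auc_05 = partial_auc(fpr, tpr)
```

Restricting the area to low false positive rates focuses the comparison on the operating region that matters when background pixels vastly outnumber targets.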
Figure 5. Operating characteristic curve example for Air Force image. MVT: ellipsoidal multivariate trimming, BACON: blocked
adaptive  computationally  efficient  outlier  nominators. 


Table 1. Area under the operating characteristic curve (AUC)
values from Figure 5

| Screened MVT | 50% MVT retention | $100 \times (1 - 1/(p+1))\%$ retention | BACON |
|---|---|---|---|
| 0.8646 | 0.7685 | 0.8351 | 0.7792 |

MVT: ellipsoidal multivariate trimming, BACON: blocked adaptive
computationally efficient outlier nominators


A repeated measures analysis of variance (ANOVA)
procedure, supplemented by the Holm-Sidak test for multiple
comparisons, was conducted for each of four datasets.25
There were two distinct types of imagery (synthetic
and real) and two performance measures of interest: AUC
and computation time (in seconds with all processing on the
same physical system). The four treatments are the algorithms
as numbered below:

1. Screened MVT;

2. MVT with Chiang’s 50% retention;

3. MVT with $100 \times (1 - 1/(p+1))\%$ retention;

4. BACON.

Table 2. Summary of the repeated measures analysis of variance
procedure with the Holm-Sidak test for multiple comparisons

| Image type by performance measure | Significant treatments? | Significant contrasts |
|---|---|---|
| Real/AUC | No | |
| Real/Time | Yes | 1-2 / 2-4 / 2-3 |
| Synthetic/AUC | Yes | 1-3 / 1-2 / 1-4 |
| Synthetic/Time | Yes | 1-2 / 2-3 / 2-4 / 1-4 |

AUC: area under the operating characteristic curve


The  subjects  are  the  images.  There  were  15  synthetic  images 
and  six  real  images.  Table  2  summarizes  the  analysis. 
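
For readers unfamiliar with the design, the F test of a one-way repeated measures ANOVA (images as subjects, algorithms as treatments) can be sketched as follows. This shows only the omnibus test under the usual sphericity assumption and omits the Holm-Sidak follow-up comparisons; the function name and the small data matrix are hypothetical and are not results from this study.

```python
import numpy as np
from scipy import stats


def repeated_measures_anova(Y):
    """F statistic and p-value for treatments; rows are subjects, columns treatments."""
    Y = np.asarray(Y, dtype=float)
    n, k = Y.shape
    grand = Y.mean()
    ss_treat = n * np.sum((Y.mean(axis=0) - grand) ** 2)   # between treatments
    ss_subj = k * np.sum((Y.mean(axis=1) - grand) ** 2)    # between subjects
    ss_total = np.sum((Y - grand) ** 2)
    ss_error = ss_total - ss_treat - ss_subj               # residual after both
    df_treat, df_error = k - 1, (k - 1) * (n - 1)
    F = (ss_treat / df_treat) / (ss_error / df_error)
    p = stats.f.sf(F, df_treat, df_error)                  # upper-tail probability
    return F, p


# Hypothetical responses: 3 subjects (rows) under 3 treatments (columns).
Y = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 3.0],
              [3.0, 3.0, 6.0]])
F, p = repeated_measures_anova(Y)
```

Removing the between-subject sum of squares is what distinguishes this design from an ordinary one-way ANOVA: image-to-image difficulty is not counted as error when comparing algorithms.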

As is evident from Table 2, the methods, as applied to
the real imagery, showed significant differences due to the
treatments only where time was concerned. Examination of
Figure 6 shows, as expected, that using a more flexible percentile
for retention within MVT, in general, resulted in a
larger AUC. As seen in Figure 6, screened MVT performed
the best for three of the six real images. It was also observed
that in most cases it took less time to complete the algorithm.
Average responses are recorded in Table 3.

Figure 6. Area under the operating characteristic curve output for real images using four algorithms.
MVT: ellipsoidal multivariate trimming, BACON: blocked adaptive computationally efficient outlier nominators.

Table 3. Mean performance of the procedures for the
real images

| Treatment | Average AUC | Average time (s) |
|---|---|---|
| Screened MVT | 0.76 | 63 |
| MVT with Chiang’s 50% retention | 0.62 | 150 |
| MVT with $100 \times (1 - 1/(p+1))\%$ retention | 0.74 | 72 |
| BACON | 0.74 | 64 |

AUC: area under the operating characteristic curve, MVT: ellipsoidal
multivariate trimming, BACON: blocked adaptive computationally
efficient outlier nominators

A more distinct separation of algorithms was observed
for the synthetic images. As depicted in Figure 7, screened
MVT performed the best for 11 of the 15 images tested. It
must be noted that the procedures performed poorly, in general,
across the synthetic images. This was largely due to
the presence of many natural anomalies in the image that
were not clustered into their own cluster set. Although these
observations were outliers within their cluster set, they
were not the anomalies of interest.

Table 4. Mean performance of the procedures for the
synthetic images

| Treatment | Average AUC | Average time (s) |
|---|---|---|
| Screened MVT | 0.34 | 67 |
| MVT with Chiang’s 50% retention | 0.16 | 197 |
| MVT with $100 \times (1 - 1/(p+1))\%$ retention | 0.15 | 93 |
| BACON | 0.23 | 116 |

AUC: area under the operating characteristic curve, MVT: ellipsoidal
multivariate trimming, BACON: blocked adaptive computationally
efficient outlier nominators

Significant contrasts were noted for screened MVT versus
the other procedures in terms of AUC (Tables 2 and 4).
Significant  differences  in  processing  times  were  also  found, 
as  indicated  in  Table  2. 
Figure 7. Area under the operating characteristic curve output for synthetic images using the four algorithms.
MVT:  ellipsoidal  multivariate  trimming,  BACON:  blocked  adaptive  computationally  efficient  outlier  nominators. 


9.  Summary 

By utilizing a more flexible estimate for MVT retention via
PCA screening we were able to improve upon the algorithms
that use the standard set retention percentiles of 50% and
$100 \times (1 - 1/(p+1))\%$ within MVT. This was accomplished
while maintaining an analysis approach that was relatively
‘hands off’. We believe the approach is promising largely
because no two images are alike in all aspects. By maintaining
a more flexible trimming percentage, we were able to
avoid some of the swamping and masking effects that were
present in the rigid MVT settings that use fewer of the available
‘good’ observations for MVT retention.

There were many natural anomalies in the DIRSIG-created
images that resulted in a fairly significant swamping
effect for both the BACON and MVT algorithms. The
BACON algorithm suffered more from the presence of
these anomalies, though, due to the trimming process used.
This is because these natural anomalous observations were
not included in the estimates of the mean and covariance in
the trimming process since they were considered outliers
for the clusters in which they were located. Even though
these observations should have been trimmed since they
do not fit into the clusters, they are not targets of interest.
Furthermore, not including these observations in the dataset
tended to tighten the estimates for the covariance and the
mean so that their Mahalanobis distances were inflated
and, thus, they tended to have distance estimates that looked
like targets.

In this paper we examine the efficacy of using the CDC
algorithm in conjunction with MVT to find outliers in a
hyperspectral image. A method is advanced to create a flexible
retention percentage that more adequately reflects the
actual number of outlier-free observations, thereby allowing
one to form robust estimates of the mean and covariance
matrix that may more effectively decrease the effects
of swamping and masking as compared to using a set percentile
for retention. The effectiveness of these ideas is
demonstrated against real and synthetically generated HSI.